Abstract

Purpose. To compare diagnostic performance of students of five differ- 
ent levels of training, educated in either a problem-based, an integrative, 
or a conventional curriculum. Method. Data were analyzed from 612 
students diagnosing 30 cases which were epidemiologically representative
for Dutch society and covered all organ systems. The number of accurate
diagnostic hypotheses was tallied for each of the groups involved.
The data were analyzed using analysis of variance (ANOVA) and post- 
hoc Newman-Keuls tests. Results. Overall, students trained within the 
problem-based framework and students trained within the context of an 
integrated curriculum displayed better diagnostic performance than stu- 
dents trained within a conventional curriculum. No overall differences 
were found between the problem-based and the integrated curriculum, 
although second- and third-year students from the latter outperformed the
comparable year groups in the two other schools involved. Conclusion. 
It was concluded that integration between basic and clinical sciences and 
an emphasis on patient problems may be the critical factors determining
superior diagnostic performance rather than whether a curriculum is self- 
or teacher-directed. Problem-based learning seems to live up to its expectations,
but so does the integrated approach to medical education. It
was also concluded that the procedure for measuring diagnostic per- 
formance appears to be valid and provides a simple means of measur- 
ing curriculum effects. It remains to be seen whether the response pat- 
terns found would be replicated when subjects are allowed to freely ex- 
plore the problem situation. 



One of the original reasons for promoting problem-based learning (PBL) as an 
approach to medical education was that students would be in a better position to
learn how to solve medical problems. 1-2 Barrows, one of the early proponents of PBL,
assumes that, through continuous exposure to real-life problems, and modeled by their 
tutor, students would acquire the craft of evaluating a patient's problem, deciding 
what's wrong and making decisions about appropriate actions to treat or manage the 
problem. In his view, fostering clinical reasoning or problem-solving skills is a primary 
goal of PBL, a goal not sufficiently emphasized in more traditional approaches to 
medical education. The assumption here is that PBL facilitates the acquisition of diag- 
nostic reasoning skills to a larger extent than conventional medical education. 

Others are more skeptical. Schmidt and colleagues, for instance, argue that 
most of the medical expertise literature suggests that medical problem-solving is
case-specific to an extent that the existence of knowledge-independent clinical reasoning
skills can be seriously questioned (see also Elstein and colleagues). If clinical
reasoning skills do not exist independent of knowledge, they cannot be taught in a direct
fashion. What, then, is the role of PBL in this respect? Norman puts it this way: "If
the game is not to teach the problem-solving process, how does one justify the use of
clinical problems as the central feature of a curriculum? The answer is straightforward.
PBL is simply a case of learning 'stuff' as students work their way through a clinical
problem. In general, the 'stuff' is unspecified. Some of it is the usual stuff of medicine - Krebs
cycles and Starling Laws. However, the problem is unbounded, and the stuff
also encompasses epidemiology, psychology, pharmacology, and just about any other
-ology available in medical, behavioral or social science (p. 282)." Boshuizen and
Schmidt 6 argue that the ability to solve a patient's problem may emerge as a
by-product of the attempt at comprehending the multiple ways in which the human body
functions and dysfunctions. Therefore, whether PBL would lead to better diagnostic 
performance would depend to a large extent on the quality, comprehensiveness and 
thoroughness of the knowledge acquisition process. These authors do not exclude the 
possibility, however, that mere exposure to case-histories may affect recognition of 
particular diseases in similar case-histories. Since students in PBL generally see more 
case-histories than students in conventional medical education (simply because cases 
are the stimuli for most of their learning), this may produce superior diagnostic
performance on similar case-histories. Hmelo, 8 for instance, found a positive effect of
previous cases discussed in a problem-based curriculum on subsequent diagnostic per- 
formance on similar ones. This implies that Barrows may be right, but for a different 
reason. 

What is the evidence in favor of each of these positions? To what extent do
students from PBL schools perform better — or in other ways differently — on diagnos- 
tic tasks as compared to students from more traditional denominations? Three studies 
address this issue in some detail. 

Patel, Groen, and Norman 9 asked subjects from a conventional and a prob- 
lem-based curriculum to solve a clinical problem and integrate three passages of rele- 
vant basic science knowledge into their explanation of the problem. The students from 
the problem-based curriculum advanced many more causal explanations than the stu- 
dents from the conventional curriculum. However, although the students from the 
problem-based curriculum did produce a large number of causal explanations, many 
were incorrect. 

In a study of the effects of curriculum type on knowledge integration, Boshui- 
zen and her colleagues 10 compared the performances of students from two medical 
schools; one problem-based and one conventional. These (preclinical) students were 
asked to explain how a specific metabolic deficiency and a specific disease could be 
related, e.g., "How does a genetic deficiency of pyruvate kinase lead to haemolytic
anemia?" In answering this question, knowledge about biochemistry and about internal
medicine must be applied and integrated. Students from the problem-based curriculum
appeared to take an analytical approach to the problem by first exploring the
biochemical aspects of the problem, later linking them to clinical aspects. Students in 
the conventional curriculum tended toward a more memory-based approach. They 
searched their memories to find a direct answer to the question. This strategy, how- 
ever, resulted in significantly less accurate answers and more failures by the students 
from the conventional program. 

A third study was recently completed by Cindy Hmelo. At three points in 
time, she compared diagnostic performance of about 40 Rush medical students, who 
were either participating in a conventional track or a PBL track. Over the course of a 
year, the preclinical subjects were presented with three times two cases. They were
requested to produce a diagnosis and an explanation of the signs and symptoms provided
in each case in terms of their underlying pathophysiology. Accuracy of diagnostic
hypotheses produced by PBL students increased linearly over time whereas the
students from the conventional track did not show different performances at the three 
measurement points. Hmelo concludes that, in the course of the year, students from 
the PBL track were able to apply the biomedical knowledge acquired to the clinical 
cases whereas the other students failed to do so. As indicated above, prior encounters 
of similar cases by the PBL group influenced the results, but the data indicated that 
case recognition did not account entirely for the difference between both groups. There 
was also a beneficial effect of PBL beyond the experience it provides with specific 
cases. In addition, the PBL students showed more coherence in their pathophysiologi- 
cal explanations as measured by the length of their reasoning chains. 

To say that these three studies point in the same direction would be an over- 
statement. Although in all studies PBL students produced more causal explanations, 
in only two of these studies these causal pathophysiological explanations were also of 
better quality. In Hmelo's study, the PBL students came up with more accurate diagnoses
whereas in the Patel et al. study, the PBL students performed more poorly than those
from a traditional curriculum. 

There may be several reasons for these inconsistencies. The first is that the 
number of students used in these three studies was fairly limited. The Boshuizen et
al. 10 study employed for instance no more than eight students, four from each school.
Patel et al. employed 72 students who were, however, assigned to six different
experimental conditions. Although statistical tests take into account small numbers (the
smaller the number of subjects, the stronger the experimental effect must be), and the 
use of small samples is fairly common practice in cognitive psychology research, the
fairly global nature of the treatment (PBL versus non-PBL) in combination with sam- 
pling errors may account for the inconsistencies. (A number of studies conducted in 
the US have taken a more molar approach by comparing performance of larger groups 
of students from traditional and problem-based curricula on the clinical examinations 

of the National Board of Medical Examiners. 11, 12 These studies have, generally, shown
students from problem-based schools to do somewhat better on the clinical part of the 
NBME, and somewhat poorer on the basic science part. It is, however, debatable to
what extent these examinations measure problem-solving skill or diagnostic performance.)

A second reason may be that different programs may employ different admis- 
sion criteria which make groups dissimilar to begin with. Although the Rush students 
were similar on a number of characteristics such as MCAT scores, it is hard to believe 
that their preference for either the PBL or the conventional track is the result of pure 
chance and has nothing to do with differences in personality or other characteristics of 
the students involved. The McMaster and McGill students, compared by Patel and her 
colleagues, are known to have different background characteristics due to different
admission criteria. 

A third, more important, reason why the findings are difficult to interpret may 
be the small number of clinical cases employed. As stated before, one of the most con- 
sistent findings in the medical expertise arena has been that diagnostic performance is 
to a large extent case-specific. Performance by physicians or students on one or a few
cases poorly predicts their performance on other cases. Therefore, performance as
observed in the experiments discussed may have depended to a large extent on the 
particular cases selected, which may have favored one group or the other. A remedy 
would be to increase the number of cases. 

In the present study, diagnostic performance of 612 students from three Dutch 
medical schools was compared: A problem-based school, a school with an integrated, 
but teacher-driven curriculum, and a school with a conventional, discipline- and lec- 
ture-based curriculum. The subjects were presented with 30 carefully selected clinical 
cases, in an attempt to avoid possible bias caused by case specificity. In addition, the
study profited from a unique feature of the Dutch allotment system: Students are ad- 
mitted to the different medical schools through a lottery procedure in which academic 
achievement plays an important role, whereas aptitude for a particular instructional 
approach does not. This feature enhances the opportunity for meaningful comparisons 
to be made. 
Method

Subjects 

Subjects were 612 second-, third-, fourth-, fifth-, and sixth-year medical stu- 
dents of three Dutch medical schools, approximately 40 per curriculum year and per 
medical school. The subjects received a small remuneration for their participation.

The curricula compared 

The University of Limburg medical school in Maastricht has had an established
problem-based curriculum since the early seventies. It was, in fact, the second school in
the world that adopted the problem-based approach. Students meet twice a week for 
small-group discussion of problems. In addition, they participate in a limited number 
of lectures, lab activities and - more extended - training in interpersonal and physical 
examination skills. The rest of the time is scheduled for self-directed learning activi- 
ties. The University of Amsterdam experiments with an integrated curriculum, in 
which small-group teaching plays a role. It has, however, more structuring elements in 
the form of lectures, labs, et cetera, than the Maastricht curriculum. In addition, stu- 
dents are not considered to be self-directed; chapters, books and articles are pre- 
scribed. The University of Groningen medical school curriculum can be characterised 
as conventional, discipline-oriented and teacher-centered. The study was completed 
just before the latter institution embarked upon a new, largely patient-oriented and
integrated curriculum. * Medical curricula in the Netherlands take six years and consist
of four years of preclinical and two years of clinical training. 

Materials 

The materials consisted of 30 short case-histories, each approximately half-a- 
page long, that covered all organ systems and were epidemiologically representative for 
the kind of diseases prevalent in Dutch society. Each of the cases included the
presentation of a patient and his or her complaints, physical examination findings, and
laboratory results whenever appropriate. A list of normal (lab) values was included. The
cases were bundled in a 17-page booklet. The following case is a representative exam- 
ple : "A 65-year old lady visits her family physician. She enters your surgery room 
with red eyes suggesting that she has been crying. She tells you that she worries a lot
because she is losing so much weight. After you have calmed her down, she tells you in a
cascade of words that she has lost 12 kilograms, although she eats well. She worries
about this state of affairs very much, sleeps poorly and is restless and agitated. She
does not take any drugs. Her family history displays nothing unusual. Upon physical
examination you find a sick, restless woman with a sweaty, warm skin. The thyroid
gland is diffusely enlarged. Blood pressure 150/89; pulse rate 140/min, irregular/unequal.
The legs show pitting edema. The heart is enlarged and a murmur suggesting
mitral insufficiency is heard. Lab data: T4 300 nmol/l, T3 10 nmol/l, TSH 0.05
mU/l. ECG: atrial fibrillation accompanied by a high ventricular rate."
Table 1 contains the diagnoses of the 30 cases included. 

Table 1. Diagnoses underlying the cases presented 

Case 1. Hyperthyroidism
Case 2. Subdural hematoma
Case 3. Paralysis agitans (= Parkinson's disease)
Case 4. Polyneuropathy
  - due to Diabetes mellitus *
Case 5. Myasthenia gravis
Case 6. Ankylosing spondylitis
Case 7. Tenosynovitis
Case 8. Polymyalgia rheumatica
Case 9. Pyelitis
Case 10. Renal cell carcinoma
Case 11. Bladder carcinoma
Case 12. Acute glomerulonephritis
Case 13. Pneumothorax
Case 14. COPD (Chronic obstructive pulmonary disease)
  - with an allergic component *
  - with a hyperreactive component *
Case 15. Pneumococcal pneumonia
Case 16. Congestive heart failure right- and left-sided
Case 17. Cardiac asthma with atrial fibrillation
  - with mitral regurgitation *
  - with tricuspid regurgitation *
Case 18. Myocardial infarction
Case 19. Hepatitis B
Case 20. (Acute) Pancreatitis
  - due to gall stones *
  - due to biliary obstruction *
Case 21. Reflux (esophagitis)
Case 22. Melanoma
Case 23. Psoriasis (vulgaris)
Case 24. (Seborrheic) dermatitis
Case 25. Otosclerosis
Case 26. Salpingitis
Case 27. Endometriosis (externa)
Case 28. Ovary cysts
Case 29. Laryngeal carcinoma
Case 30. Appendicitis

* Additional credit points were awarded for information marked with an asterisk; omission
of information in parentheses did not influence the accuracy rating of the diagnosis.



Procedure 

Subjects were tested in small groups of varying size. They were requested
to read each case and provide a differential diagnosis if they could. If they were un- 
able to come up with a specific diagnosis, they were allowed to state which organ 
(system) seemed to be affected or which pathophysiological mechanism seemed to be 
involved. They were encouraged not to spend too much time on each of the cases.
Subjects were given sufficient time to complete the test. The following scoring system 
was used: If the correct diagnosis appeared as the most likely one in the differential 
diagnosis, the answer was awarded 2 credit points. If the correct diagnosis appeared 
as part of a differential diagnosis, but not in first position, the answer was awarded 1 
credit point. The accurate diagnoses for cases 4, 14, 17 and 20 could contain one or 
two additional elements which were each credited with one additional point. The 
maximum score for the test as a whole, therefore, was equal to 67. Interrater agreement 
exceeded 90%. The resulting data were analyzed using ANOVA. 
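
The scoring rules just described can be sketched in a few lines of code. This is only an illustration: the function and variable names are hypothetical, and the sketch assumes that bonus points were granted only when the correct diagnosis was actually listed, which the text does not state explicitly. The bonus counts per case follow the asterisks in Table 1.

```python
# Illustrative sketch of the scoring scheme; names are hypothetical.
# Maximum bonus points per case, taken from the asterisks in Table 1.
BONUS_ELEMENTS = {4: 1, 14: 2, 17: 2, 20: 2}
N_CASES = 30


def score_case(case_no, rank_of_correct, bonus_found=0):
    """Score a single answer.

    rank_of_correct: 1 if the correct diagnosis is listed first,
        a larger integer if it appears lower in the differential,
        None if it is absent.
    bonus_found: number of asterisked elements the student mentioned.
    """
    if rank_of_correct is None:
        return 0  # assumption: no bonus without the correct diagnosis
    base = 2 if rank_of_correct == 1 else 1
    bonus = min(bonus_found, BONUS_ELEMENTS.get(case_no, 0))
    return base + bonus


def max_total():
    """Highest attainable score over all 30 cases."""
    return 2 * N_CASES + sum(BONUS_ELEMENTS.values())


print(max_total())  # 67, matching the maximum reported in the text
```

Under these assumptions the 30 base scores contribute 60 points and the seven asterisked elements contribute the remaining 7, which reproduces the stated maximum of 67.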

Results 

A statistically significant effect of curriculum type on diagnostic performance
was found, F(2, 597) = 14.40, p < .0001, MSe = 535.42. In addition, an effect of
curriculum year on performance was demonstrated, F(4, 597) = 457.49, p < .0001,
MSe = 17007.16. Finally, both variables interacted with each other, F(8, 597) = 3.795,
p < .001, MSe = 141.09. Table 2 contains average diagnostic scores and standard
deviations for each of the schools and all levels of training involved. Figure 1 displays the
diagnostic scores visually.
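
As a reader's consistency check on the statistics reported above, one may divide each reported mean square by its F ratio. The sketch below assumes that the three reported mean squares are effect mean squares (an interpretation, not something the text states); if so, each quotient should recover the same error mean square, and the error degrees of freedom should equal the number of students minus the number of school-by-year cells.

```python
# Reported F ratios and mean squares, copied from the Results text.
reported = {
    "curriculum type": (14.40, 535.42),
    "curriculum year": (457.49, 17007.16),
    "type x year": (3.795, 141.09),
}

# Assumption: each reported mean square belongs to the effect, so that
# F = MS_effect / MS_error and MS_error = MS_effect / F.
implied_error_ms = {name: ms / f for name, (f, ms) in reported.items()}

for name, value in implied_error_ms.items():
    print(f"{name}: implied error mean square = {value:.2f}")

# Error degrees of freedom for a 3 (school) x 5 (year) between-subjects design.
n_students, n_schools, n_years = 612, 3, 5
error_df = n_students - n_schools * n_years
print(error_df)  # 597, as in the reported F tests
```

All three quotients come out at roughly 37.2, so the reported values are mutually consistent under this reading.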

Table 2. Average diagnostic scores by five levels of expertise in three Dutch medical schools

Level of       Maastricht           Amsterdam            Groningen
Expertise      (Problem-based)      (Integrative)        (Conventional)
               Mean      SD         Mean      SD         Mean      SD
Year 2          6.33     3.86       10.37     4.18        7.30     3.74
Year 3         14.49     7.25       20.95     5.63       14.49     5.90
Year 4         24.29     6.49       23.69     6.88       23.73     6.75
Year 5         31.13     6.95       30.09     8.16       26.93     6.66
Year 6         39.66     6.87       39.83     5.84       36.25     4.97

Figure 1. Average diagnostic performance as a function of school and curriculum year.
[Figure not reproduced. Vertical axis: Diagnostic Performance; horizontal axis: Curriculum Year (year 2 to year 6).]

Post-hoc Student-Newman-Keuls tests revealed that, overall, students from
the conventional Groningen medical school performed more poorly than those of the other
two schools. Comparing means in each year group shows that students from the integrated
Amsterdam curriculum performed significantly better than the other two groups
in the second and third curriculum year, whereas students from the problem-based
Maastricht curriculum performed better than the students from the conventional
curriculum in year 5, but not better than the students from the integrated curriculum.
Students from the integrated curriculum also performed better than those of the
conventional curriculum. Differences between adjacent curriculum years within each of the
schools were all statistically significant.
Discussion

The findings presented in this article constitute, to our knowledge, the first 
large-scale study that compares performance of medical students from different curricula
under controlled conditions. The cases presented were epidemiologically representative
for Dutch society and covered the major organ systems. The number of cases to
be diagnosed was much larger than those included in similar studies, in an attempt to 
avoid outcomes biased by case specificity. In addition, the number of students in- 
volved and the five levels of training included also represent a departure from existing 
practices. 

We will first discuss differences between the problem-based and the conven- 
tional program. Subsequently we will deal with the data comparing the problem-based 
and the integrated curriculum and their implications. 

The students trained within the context of a problem-based curriculum 
showed better diagnostic performance than the students from the conventional cur- 
riculum. A significant overall effect of curriculum type was found. At the end of the 
six years, the Maastricht students performed almost 9% better than the Groningen 
comparison group. The question is, of course, whether these 9% represent a meaningful 
portion. Expressed in terms of accuracy of diagnostic performance this percentage 
means that the Maastricht students on average diagnosed 1.5 out of 30 cases more ac- 
curately than the students from the conventional curriculum. Assuming that these stu- 
dents will actually see about thirty patients each day in the coming years and assuming 
that our findings signify a difference in actual diagnostic expertise between students 
from both schools (rather than just an effect on a written test), the difference soon be- 
comes sizable. After only one month, a Groningen graduate would have missed on av- 
erage 37.5 diagnoses not missed by a Maastricht graduate. Of course, this kind of 
reasoning ignores possible compensation effects occurring during further training and 
practice. In addition, it assumes, perhaps uncritically, that performance on a
paper-and-pencil test can be generalized to performance in professional practice without
much ado. Nevertheless, it shows that even relatively small effects of curriculum type,
when extrapolated, may affect the quality of every-day diagnostic performance in
non-trivial ways. Interestingly, although the findings represent a curriculum main effect, the
differences become apparent only in the clerkship years. It is not clear why this is so.
This may imply that effects of problem-based learning are the result of an incubation- 
type of process: They appear only when students begin to deal with real patients in 
the academic hospital or outside. Alternatively, it may simply imply that the Maastricht
clerkship is more effective than the Groningen one. The latter explanation is,
however, less likely, because the first measurement on which significant differences
between curricula appeared was taken early in the clerkship phase.
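
The extrapolation in this paragraph can be retraced with a short calculation. The sketch below uses the year-6 means from Table 2 and the maximum score of 67; the figure of 25 working days per month is an assumption inferred from the ratio 37.5 / 1.5 and is not stated in the text. The base of the "almost 9%" is not stated either, so both candidate bases are shown.

```python
# Year-6 means from Table 2 and test constants from the Method section.
MAASTRICHT_Y6 = 39.66
GRONINGEN_Y6 = 36.25
MAX_SCORE = 67
N_CASES = 30
PATIENTS_PER_DAY = 30        # as assumed in the text
WORKING_DAYS_PER_MONTH = 25  # assumption: inferred from 37.5 / 1.5

diff_points = MAASTRICHT_Y6 - GRONINGEN_Y6

# The percentage depends on which school's mean is used as the base.
share_of_maastricht = diff_points / MAASTRICHT_Y6
share_of_groningen = diff_points / GRONINGEN_Y6

# Express the score difference as a number of cases out of 30.
cases_equivalent = diff_points / MAX_SCORE * N_CASES

# Use the rounded figure from the text for the monthly extrapolation.
extra_per_day = round(cases_equivalent, 1) * PATIENTS_PER_DAY / N_CASES
extra_per_month = extra_per_day * WORKING_DAYS_PER_MONTH

print(f"difference as share of Maastricht mean: {share_of_maastricht:.1%}")
print(f"difference as share of Groningen mean:  {share_of_groningen:.1%}")
print(f"cases out of {N_CASES}: {cases_equivalent:.2f}")
print(f"additional accurate diagnoses per month: {extra_per_month}")
```

With these inputs the difference corresponds to about 1.5 cases out of 30, and 1.5 additional accurate diagnoses per day over 25 days gives the 37.5 mentioned in the text.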

No overall differences were found between the integrated teacher-directed and 
the student-centered curriculum. The students in the Amsterdam curriculum performed 
better in the second and third year. That the study was cross-sectional rather than 
longitudinal blurs this finding because the integrated curriculum studied was imple- 
mented only in 1990. Hence, year 5 and 6 students were trained under the old, tradi- 
tional, regimen. This makes it difficult to draw substantive conclusions about differ- 
ences between the integrated, yet fairly teacher-centered, approach and the problem- 
based approach. Let's assume, however, for the time being, that the lack of difference 
overall represents a "true" curriculum effect. (Note 1) The question, then, is, what do the
problem-based and the integrated curriculum have in common, such that their effects
on students are similar, and what distinguishes them from the third, conventional cur- 
riculum? A tentative answer would be the fact that the problem-based and the inte- 
grated curriculum both offer subject-matter to students in an integrated fashion, and 
that students are encouraged to process the information in an active way through 
small-group discussion. Thus, subject-matter integration and active processing seem 
more important factors in attaining proficiency in diagnostic reasoning than the amount 
of self-directedness of a curriculum. (Self-directed learning, to be fair, has never been
claimed to facilitate the acquisition of diagnostic skills. It is primarily advocated to 
help students acquire the skills for life-long, self-driven learning. 2 ) 

Where to go from here? 

Some claim that presenting students with pre-packaged clinical information,
as we have done, is insufficient to study their clinical reasoning skills. 14 The hallmark
of diagnostic reasoning is free inquiry; subjects should be put in a position in which 
they can gather the information in open interaction with the patient. Although previous
experiments with free data gathering have generally shown that this approach
does not contribute to the validity of distinctions between expert and less-expert
diagnosticians, it may be worthwhile to pursue this issue once again. In the past, data
gathering has been studied focusing mainly on formal characteristics of the process.
This was in line with the spirit of that time. 7 An approach more geared toward the
contents of the interaction between a diagnostician and his patients may unravel
patterns not observed before. 15

Note 1. As the first author has argued elsewhere, 16 trying to attribute a curricular effect to
particular elements of the curricula compared is extremely complicated. Curriculum effect
studies can be compared to clinical trials spanning several years in which the subjects of
unknown background are submitted to treatments of which the effective elements are un-

A second issue to be clarified is to what extent the present procedure used
for comparing students from different curricula is sufficiently sensitive to smaller-scale 
course effects. It is clear that the procedure has more than acceptable discriminant va- 
lidity; the set of 30 case-histories produced significant differences between all levels of 
expertise within each of the schools. But would the procedure enable measuring effects 
of, say, a course on the cardiovascular system? Do students do better on cases relevant
to that system after they have completed the particular course? If so, the approach 
would not only be useful to measure student progress over the years but would also be 
a useful instrument for program evaluation. A third issue, finally, is to what extent 
performance on the diagnostic tasks is related to basic-science and clinical knowledge 
related to these tasks. Research is in progress to answer these questions. 